CRB-Tree: An Efficient Indexing Scheme for Range-Aggregate Queries
Abstract
We propose a new indexing scheme, called the CRB-tree, for efficiently answering range-aggregate queries. The range-aggregate problem is defined as follows: given a set of weighted points in R^d, compute the aggregate of the weights of the points that lie inside a d-dimensional query rectangle. In this paper we focus on the range-COUNT, SUM, and AVG aggregates. First, we develop an indexing scheme for answering two-dimensional range-COUNT queries that uses O(N/B) disk blocks and answers a query in O(log_B N) I/Os, where N is the number of input points and B is the disk block size. This is the first optimal index structure for the 2D range-COUNT problem. The index can be extended to obtain a near-linear-size structure for answering range-SUM queries using O(log_B N) I/Os. We also obtain similar bounds for rectangle-intersection aggregate queries, in which the input is a set of weighted rectangles and a query asks for the aggregate of the weights of those input rectangles that overlap the query rectangle. This result immediately improves a recent result on temporal-aggregate queries. Our indexing scheme can be dynamized and extended to higher dimensions. Finally, we demonstrate the practical efficiency of our index by comparing its performance against the kdB-tree. For a dataset of around 100 million points, CRB-tree queries are 8–10 times faster than kdB-tree queries. Furthermore, unlike other indexing schemes, the query performance of the CRB-tree is oblivious to the distribution of the input points and to the placement, shape, and size of the query rectangle.
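To make the two query types in the abstract concrete, the following is a minimal brute-force sketch of what a range-aggregate query and a rectangle-intersection aggregate query compute in two dimensions. It is not the CRB-tree: it scans all N inputs in memory and only serves as a reference definition against which an index could be checked. The function names, the tuple layouts, and the choice of closed (boundary-inclusive) rectangles are assumptions made for illustration and do not come from the paper.

```python
from typing import List, Optional, Tuple

# Hypothetical layouts chosen for this sketch (not from the paper):
Point = Tuple[float, float, float]        # (x, y, weight)
Rect = Tuple[float, float, float, float]  # (x_lo, y_lo, x_hi, y_hi), closed


def range_aggregates(points: List[Point], q: Rect) -> Tuple[int, float, Optional[float]]:
    """Return (COUNT, SUM, AVG) over the points inside query rectangle q.

    Linear scan over all points; AVG is None when no point falls in q.
    """
    x_lo, y_lo, x_hi, y_hi = q
    inside = [w for (x, y, w) in points
              if x_lo <= x <= x_hi and y_lo <= y <= y_hi]
    count = len(inside)
    total = float(sum(inside))
    avg = total / count if count else None  # AVG is derived from SUM and COUNT
    return count, total, avg


def rect_intersection_sum(rects: List[Tuple[Rect, float]], q: Rect) -> float:
    """Return the SUM of weights of input rectangles that overlap q.

    Two closed rectangles overlap when their x-extents and y-extents both intersect.
    """
    qx_lo, qy_lo, qx_hi, qy_hi = q
    total = 0.0
    for (x_lo, y_lo, x_hi, y_hi), w in rects:
        if x_lo <= qx_hi and qx_lo <= x_hi and y_lo <= qy_hi and qy_lo <= y_hi:
            total += w
    return total


if __name__ == "__main__":
    pts = [(1, 1, 2.0), (2, 3, 4.0), (5, 5, 10.0)]
    print(range_aggregates(pts, (0, 0, 3, 3)))

    boxes = [((0, 0, 2, 2), 1.0), ((3, 3, 4, 4), 5.0), ((1, 1, 5, 5), 2.0)]
    print(rect_intersection_sum(boxes, (2.5, 2.5, 3.5, 3.5)))
```

Each call here costs time linear in the input size, which is the baseline that the O(log_B N) I/O bound stated in the abstract improves on.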
Similar resources
A Dynamic Method for Answering Ad Hoc Continuous Aggregate Queries
Data Streams are infinite, fast, time-stamp data elements which are received explosively. Generally, these elements need to be processed in an online, real-time way. So, algorithms to process data streams and answer queries on these streams are mostly one-pass. The execution of such algorithms has some challenges such as memory limitation, scheduling, and accuracy of answers. They will be more ...
An Efficient Indexing Method for Box Queries in NDDS Spaces using BoND-tree
Similarity searches in multidimensional Non-ordered Discrete Data Spaces (NDDS) are becoming increasingly important for application areas such as bioinformatics, biometrics, data mining and E-commerce. Efficient similarity searches require robust indexing techniques. Box queries (or window queries) are a type of query which specifies a set of allowed values in each dimension. Unfortunately, exi...
Indexing range sum queries in spatio-temporal databases
Although spatio-temporal databases have received considerable attention recently, there has been little work on processing range sum queries on the historical records of moving objects despite their importance. Since the direct access to a huge amount of data to answer range sum queries incurs prohibitive computation cost, materialization techniques based on existing index structures are sugges...
Efficient Execution of Range-Aggregate Queries in Data Warehouse Environments
Range-aggregate queries on the data cube are powerful tools for analysis in data warehouse environments. Cubetree is a technique materializing a data cube through an R-tree. It provides efficient data accessibility, but involves some drawbacks to traverse all the internal and leaf nodes within given query ranges to compute range-aggregate queries. In this paper, we propose a novel index structu...
Continuous Range Monitoring of Mobile Objects in Road Networks
In contrast to regular queries that are evaluated only once, a continuous query remains active over a period of time and has to be continuously evaluated to provide up to date answers. We propose a method for continuous range query processing for different types of queries, characterized by mobility of objects and/or queries which all follow paths in an underlying spatial network. The method as...
Publication date: 2003